WebMaster: Knowledge-Based Verification of Web-Pages
Authors
Abstract
Maintaining contents of Web sites is an open and urgent problem on the current World Wide Web as well as on company intra-nets. Although many current tools deal with problems such as broken links and missing images, very few solutions exist for maintaining the contents of Web sites and intra-nets. We present a knowledge-based approach to the verification of Web-page contents. The user exploits semantic markup in Web-pages to formulate rules and constraints that must hold on the information in a site. An inference engine subsequently uses these rules to categorise Web-pages in an ontology of pages, while the constraints are used to define categories of pages which contain errors. We have constructed WebMaster, a software tool for knowledge-based verification of Web-pages. WebMaster allows the user to define rules and constraints in a graphical format, and is then able to use these rules to detect outdated, inconsistent and incomplete information in Web-pages. In this paper, we describe the various options for semantic markup on the Web, we define a precise logical and graphical format for rules and constraints, and we report on our practical experiences with WebMaster.

Acknowledgements
The work reported in this paper has only been possible with the contributions from all current and past members of the WebMaster team at AIdministrator: Jan Bakker, Chris Fluit, Herko ter Horst, Walter van Iterson, Arjohn Kampman and Gert-Jan van de Streek.

Part I: “The business issue”
Similar resources
Knowledge-Based Validation, Aggregation, and Visualization of Meta-data: Analyzing a Web-Based Information System
As meta-data become of ever more importance to the Web, we will need to start managing such metadata. We briefly review existing approaches to meta-data management, and conclude that there is a strong need for meta-data validation and aggregation. In order to base our claims on a real scenario we briefly describe BUISY, an existing web-based geographic information system that exploits meta-data...
UnURL: Unsupervised Learning from URLs
Web pages are identified by their URLs. For authoritative web pages, pages that are focused on a specific topic, webmasters tend to use URLs which summarize the page. URL information is good for clustering because URLs are small and ubiquitous, making techniques based on URL information alone orders of magnitude faster than those which also make use of the text content. We present a system that makes...
Use of Semantic Similarity and Web Usage Mining to Alleviate the Drawbacks of User-Based Collaborative Filtering Recommender Systems
One of the best-known methods for recommendation is user-based Collaborative Filtering (CF). This system compares the active user’s item ratings with the historical rating records of other users to find similar users, and recommends items which seem interesting to these similar users and have not been rated by the active user. As a way of computing recommendations, the ultimate goal of the user-ba...
ACE: An Adaptive CSS Engine for Web Pages and Web-based Applications
ACE is a system that tailors web interfaces to the users’ behavior without requiring end user intervention. By leveraging implicit interactions (e.g., tracking mouse or touch events), the visual appearance of page elements is subtly modified in an unsupervised and incremental manner. Such page elements (accessed by means of CSS selectors) and their alterable parts (defined as CSS properties) ar...
Adaptive Sites: Automatically Learning from User Access Patterns
Designing a web site is a complex problem. Logs of user accesses to a site provide an opportunity to observe users interacting with that site and make improvements to the site’s structure and presentation. We propose adaptive sites: web sites that improve themselves by learning from user access patterns. Adaptive webs can make popular pages more accessible, highlight interesting links, connect ...
Publication date: 1999